Semi-automated ontology generation within OBO-Edit
Abstract
MOTIVATION: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it is of great importance for the further development of ontologies and for biocuration.

RESULTS: We have developed the Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG), a system which supports the creation and extension of OBO ontologies by semi-automatically generating terms, definitions and parent-child relations from text in PubMed, the web and PDF repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It generates terms by identifying statistically significant noun phrases in text. For definitions and parent-child relations it employs pattern-based web searches. We systematically evaluate each generation step using manually validated benchmarks. The term generation leads to high-quality terms also found in manually created ontologies. Up to 78% of definitions are valid and up to 54% of child-ancestor relations can be retrieved. There is no other validated system that achieves comparable results. By combining the prediction of high-quality terms, definitions and parent-child relations with the ontology editor OBO-Edit we contribute a thoroughly validated tool for all OBO ontology engineers.

AVAILABILITY: DOG4DAG is available within OBO-Edit 2.1 at http://www.oboedit.org.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
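The abstract does not say which patterns DOG4DAG uses for its pattern-based searches. As a rough illustration of the general idea only, the sketch below applies one generic lexico-syntactic pattern ("X such as A, B and C") to a sentence and returns candidate child-parent pairs. The regular expression, the function name `extract_parent_child` and the two-word limit on the parent phrase are all assumptions made for this example, not details of the published system.

```python
import re

# Illustrative pattern only (an assumption, not DOG4DAG's actual pattern set):
# "<parent phrase> such as <child>, <child> and <child>".
# The parent phrase is crudely approximated as one or two word tokens.
SUCH_AS = re.compile(r"\b(\w+(?: \w+)?) such as ([^.;]+)")

# Separators between enumerated children: commas, "and", "or".
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_parent_child(sentence):
    """Return (child, parent) candidate pairs matched by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        parent = match.group(1)
        for child in SEPARATOR.split(match.group(2)):
            child = child.strip()
            if child:
                pairs.append((child, parent))
    return pairs


if __name__ == "__main__":
    text = ("The family includes serine proteases such as trypsin, "
            "chymotrypsin and elastase.")
    for child, parent in extract_parent_child(text):
        print(f"{child} is_a {parent}")
```

A real system would need proper noun-phrase chunking and validation of the candidates; the abstract reports that only a portion of generated definitions and relations turn out to be valid, which is why the tool is described as semi-automatic.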
Related resources
OBO-Edit - an ontology editor for biologists
OBO-Edit is an open source, platform-independent ontology editor developed and maintained by the Gene Ontology Consortium. Implemented in Java, OBO-Edit uses a graph-oriented approach to display and edit ontologies. OBO-Edit is particularly valuable for viewing and editing biomedical ontologies. AVAILABILITY: https://sourceforge.net/project/showfiles.php?group_id=36855.
A review of ontologies developed according to the principles of the Open Biomedical Ontologies
Background and Aim: Ontologies facilitate data integration, exchange, searching and querying. Open Biomedical Ontologies (OBO) Foundry is a solution for creating reference ontologies. In this foundry, the design of ontologies is based on established principles which allow for their interactions as a single system. The purpose of this study is to determine the main features of ontologies develop...
TRAK ontology: Defining standard care for the rehabilitation of knee conditions
In this paper we discuss the design and development of TRAK (Taxonomy for RehAbilitation of Knee conditions), an ontology that formally models information relevant for the rehabilitation of knee conditions. TRAK provides the framework that can be used to collect coded data in sufficient detail to support epidemiologic studies so that the most effective treatment components can be identified, ne...
The 10th Annual Bio-Ontologies Meeting
There is a strong need to map the OBO format to OWL and provide tools that enable end users to easily perform this translation. To fulfill this need, the National Center for Biomedical Ontology created the NCBO OBO to OWL mapping and a set of tools to perform the translations for the Protégé and OBO-Edit editors and for command line use. A group of OBO developers and OWL experts worked cooperat...
Extending ontologies by finding siblings using set expansion techniques
MOTIVATION Ontologies are an everyday tool in biomedicine to capture and represent knowledge. However, many ontologies lack a high degree of coverage in their domain and need to improve their overall quality and maturity. Automatically extending sets of existing terms will enable ontology engineers to systematically improve text-based ontologies level by level. RESULTS We developed an approac...
Journal title:
Volume 26, issue
Pages -
Publication date: 2010